CONTROL OF SIMULTANEOUS SWITCH NOISE FROM MULTIPLE OUTPUTS

FIELD OF THE INVENTION

[0001] The present invention relates to a method and 
apparatus for reducing the transient currents associated with 
simultaneously switched outputs of a semiconductor chip. 

RELATED ART 

[0002] Output signals of a semiconductor chip are 
typically switched simultaneously in response to an output 
clock signal. Such simultaneously switched outputs result in 
large transient currents, which cannot be easily controlled. 
Conventional methods used to control the large transient 
currents associated with simultaneously switched outputs 
include controlling the slew rate of the output signals 
and/or controlling the strength of the output signals. 
However, such methods either require excessive circuitry, or 
reduce the integrity of the output signals. 

[0003] It would therefore be desirable to have an improved 
method and apparatus for reducing the high transient current 
associated with simultaneously switched outputs. 

SUMMARY 

[0004] Accordingly, the present invention reduces 
transient current created during output switching by time 
multiplexing the output switching operation within each clock 
period. A plurality of output clock signals are generated in 
response to an input clock signal, wherein the output clock 
signals are phase-shifted with respect to the input clock 
signal. Each of the phase-shifted clock signals exhibits an 
active (e.g., rising) edge during a single period of the 
input clock signal. Different groups of input/output blocks 
are switched in response to the various phase-shifted clock 
signals, such that output switching occurs at various times 
during the period of the input clock signal. The phase- 
shifted clock signals can be generated with predetermined 
phase differences or with dynamically determined phase
differences.

[0005] In accordance with one embodiment, a digital clock
manager generates a plurality of output clock signals, which 
are separated by 90-degree phase differences. For example, a 
first output clock signal may be synchronous with the input clock
signal, a second output clock signal may lag the first output 
clock signal by 90 degrees, a third output clock signal may 
lag the second output clock signal by 90 degrees, and a 
fourth output clock signal may lag the third output clock 
signal by 90 degrees. A first set of input/output resources 
are clocked by the first output clock signal, a second set of 
input/output resources are clocked by the second output clock 
signal, a third set of input/output resources are clocked by 
the third output clock signal, and a fourth set of 
input/output resources are clocked by the fourth output clock
signal. As a result, the transient switching current 
existing at any given time is reduced by a factor of four. 
[0006] In accordance with another embodiment, a digital 
clock manager determines the period of the input clock 
signal. For example, delay elements may be introduced to the 
path of the input clock signal until the resulting output 
clock signal is synchronous with the input clock signal. At 
this time, the delay introduced by the delay elements is 
equal to one period of the input clock signal. The input 
clock signal (or resulting output clock signal) is applied to 
a chain of series-connected programmable delay lines, thereby 
generating a corresponding plurality of delayed clock 
signals. The delay introduced by each of the programmable 
delay lines is selected with respect to the period of the 
input clock signal. Thus, the sum of the delays introduced 
by the programmable delay lines is less than the period of 
the input clock signal. 

[0007] In one embodiment, each of the programmable delay 
lines includes a plurality of delay elements, wherein each of 
the delay elements in the programmable delay lines is 
identical to each delay element in the digital clock manager.
In this embodiment, the number of delay elements enabled
within each of the programmable delay lines is determined by
dividing the number of delay elements introduced by the
digital clock manager by the number of programmable delay 
lines. As a result, each of the programmable delay lines 
introduces the same delay to the received clock signal. 
[0008] The present invention will become more clearly 
understood in view of the following description and drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0009] Fig. 1 is a block diagram of a field programmable 
gate array (FPGA) in accordance with one embodiment of the 
present invention. 

[0010] Fig. 2 is a waveform diagram illustrating an input 
clock signal CLKin, associated data values, and output clock 
signals CLK0, CLK90, CLK180 and CLK270 in accordance with one
embodiment of the present invention illustrated by Fig. 1. 
[0011] Fig. 3 is a circuit diagram illustrating a portion 
of an FPGA in accordance with another embodiment of the 
present invention.

[0012] Fig. 4 is a circuit diagram of a programmable delay 
line in accordance with one embodiment of the present 
invention illustrated by Fig. 3. 

[0013] Fig. 5 is a waveform diagram illustrating an input 
clock signal CLKin, output clock signals CLK0-CLK16 and
associated data values in accordance with one embodiment of 
the present invention illustrated by Fig. 3. 

DETAILED DESCRIPTION 

[0014] Fig. 1 is a block diagram of a semiconductor chip 
100 in accordance with one embodiment of the present 
invention. In the described embodiments, semiconductor chip
100 is a programmable logic device, such as a field
programmable gate array (FPGA). However, semiconductor chip
100 need not be a programmable logic device. FPGA 100
includes an array of configurable logic blocks (CLBs) and a 
programmable interconnect structure, which are illustrated as 
block 101, a digital clock manager (DCM) 111, and 
configurable input/output blocks (IOBs) 121-124, 131-134,
141-144 and 151-154. In general, the elements of FPGA 100
are largely conventional, and are described in more detail in
"Virtex™-II Platform FPGA Handbook", available from Xilinx,
Inc. As described in more detail below, the configuration of 
FPGA 100 significantly reduces transient currents during 
output switching. 

[0015] FPGA 100 operates as follows in accordance with one 
embodiment of the present invention. First, FPGA 100 is 
configured to implement a desired circuit by programming 
configuration memory cells of the FPGA. DCM 111 is 
configured to provide four output clock signals CLK0, CLK90,
CLK180 and CLK270 in response to an input clock signal CLKin 
during normal operation of FPGA 100. Although DCM 111 
generates four clock phases in the present embodiment, it is 
understood that DCM 111 can be modified to provide other 
numbers of clock phases in other embodiments. 
[0016] Fig. 2 is a waveform diagram illustrating the input
clock signal CLKin, associated data values D1-D4, and the
clock signals CLK0, CLK90, CLK180 and CLK270. As illustrated
by Fig. 2, the clock signals CLK0, CLK90, CLK180 and CLK270 are
separated in phase by ninety degrees. In the described
embodiment, the input clock signal CLKin and the clock signal
CLK0 are synchronized by DCM 111. Thus, both the CLKin and
CLK0 signals exhibit rising edges at time T0. One-quarter
period later, at time T1, the CLK90 signal exhibits a rising
edge, such that the CLK90 signal lags the CLK0 signal by 90
degrees. One-quarter period after time T1 (at time T2), the
CLK180 signal exhibits a rising edge, such that the CLK180
signal lags the CLK90 signal by 90 degrees. One-quarter
period after time T2 (at time T3), the CLK270 signal exhibits
a rising edge, such that the CLK270 signal lags the CLK180
signal by 90 degrees. Note that data values D1[15:0] are
clocked out of FPGA 100 during the clock period that includes
times T0-T3.
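
The quarter-period spacing of times T0-T3 can be illustrated
numerically. The following Python sketch is an illustration
only: the 10 ns period and the function name
rising_edge_times are assumptions chosen for the example and
do not appear in the described embodiment.

```python
from fractions import Fraction


def rising_edge_times(period_ps, phases_deg=(0, 90, 180, 270)):
    """Return {phase: offset of the rising edge from T0, in ps}."""
    # A lag of p degrees corresponds to p/360 of one CLKin period.
    return {p: Fraction(period_ps) * p / 360 for p in phases_deg}


# Hypothetical 10 ns CLKin period (not a value from the description).
edges = rising_edge_times(10_000)
for phase, offset in edges.items():
    print(f"CLK{phase}: rising edge at T0 + {offset} ps")
```

With this assumed period, successive rising edges are
separated by one quarter of the period, corresponding to
times T0, T1, T2 and T3.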

[0017] In accordance with one embodiment, various IOBs of
FPGA 100 are clocked with different clock signals. For
example, IOBs 121-124 are clocked with the CLK0 signal, IOBs
131-134 are clocked with the CLK90 signal, IOBs 141-144 are
clocked with the CLK180 signal, and IOBs 151-154 are clocked
with the CLK270 signal. Thus, four bits of the D1[15:0]
value (e.g., D1[3:0]) are clocked out through IOBs 121-124 at
time T0, four bits of the D1[15:0] value (e.g., D1[7:4]) are
clocked out through IOBs 131-134 at time T1, four bits of the
D1[15:0] value (e.g., D1[11:8]) are clocked out through IOBs
141-144 at time T2, and four bits of the D1[15:0] value
(e.g., D1[15:12]) are clocked out through IOBs 151-154 at time
T3. Thus, only one fourth of the IOBs are clocked at any
given time. This substantially reduces the transient current
associated with output switching. Although the clock signals
are applied to adjacent IOBs in an interleaved manner in the
illustrated example, this is not required. For example, each
of the IOBs located along a single edge of FPGA 100 (e.g.,
IOBs 121, 131, 141 and 151) can be coupled to receive the
same clock signal.
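
The grouping of data bits by clock phase can be sketched as
follows. This Python fragment is a minimal illustration, not
a description of the FPGA configuration itself. The order in
which individual bits are assigned to individual IOBs within
a group is an assumption, as are the names GROUPS and
assign_bits.

```python
# Clock phase assigned to each group of four IOBs (numerals from Fig. 1).
GROUPS = [
    ("CLK0", range(121, 125)),    # IOBs 121-124, switched at T0
    ("CLK90", range(131, 135)),   # IOBs 131-134, switched at T1
    ("CLK180", range(141, 145)),  # IOBs 141-144, switched at T2
    ("CLK270", range(151, 155)),  # IOBs 151-154, switched at T3
]


def assign_bits(width=16):
    """Map each bit index of the data value to a (clock, IOB numeral) pair."""
    iobs = [(clk, iob) for clk, group in GROUPS for iob in group]
    # Assumed ordering: bit 0 goes to the first IOB of the first group.
    return {bit: iobs[bit] for bit in range(width)}


mapping = assign_bits()

# Collect the bit indices that switch on each clock phase.
per_clock = {}
for bit, (clk, _) in mapping.items():
    per_clock.setdefault(clk, []).append(bit)
```

Under these assumptions, each clock phase switches exactly
four of the sixteen bits.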

[0018] Note that an external device attached to FPGA 100 
must receive the input clock signal CLKin, and in response, 
generate clock signals equivalent to the CLK0, CLK90, CLK180
and CLK270 signals. The external device must have a first
set of input circuits coupled to receive the equivalent CLK0
signal, a second set of input circuits coupled to receive the
equivalent CLK90 signal, a third set of input circuits
coupled to receive the equivalent CLK180 signal, and a fourth
set of input circuits coupled to receive the equivalent
CLK270 signal. The first, second, third and fourth sets of
input circuits are coupled to receive the data signals
clocked out of IOBs 121-124, 131-134, 141-144 and 151-154,
respectively. The data values clocked out of IOBs 121-124,
131-134, 141-144 and 151-154 are then clocked into the first,
second, third and fourth sets of input circuits of the
external device in response to the equivalent CLK0, CLK90,
CLK180 and CLK270 signals, respectively. 

[0019] Fig. 3 is a circuit diagram illustrating a portion 
of a semiconductor chip 300 in accordance with another 
embodiment. In the described embodiment, semiconductor chip 
300 is described as a programmable logic device, such as an 
FPGA (although this is not necessary). FPGA 300 and FPGA 100
include similar programmable logic resources. 
[0020] The illustrated portion of FPGA 300 includes IOBs
301_0-301_N, DCM 311, programmable delay lines 321_1-321_N,
delay select register 340 and arithmetic logic unit (ALU) 350.
DCM 311 includes delay select circuit 312, delay line 313 and
multiplexer 314. Delay line 313 includes a plurality (X) of
delay elements 315_1-315_X, which are connected in series as
illustrated. The output terminals of delay elements
315_1-315_X are coupled to input terminals of multiplexer 314.
[0021] IOB 301_0 is configured to receive the CLK0 signal
from DCM 311. IOB 301_0 clocks the input signal IN_0 and
output signal O_0 in response to the CLK0 signal. As
described in more detail below, the CLK0 signal is
synchronized with the input clock signal CLKin. In other
embodiments, the CLK0 signal can simply have a fixed phase
relationship with respect to the CLKin signal.
[0022] The CLK0 signal is propagated through delay lines
321_1-321_N, thereby creating delayed clock signals CLK1-CLKN,
respectively. The delayed clock signals CLK1-CLKN are
provided to IOBs 301_1-301_N, respectively. Thus, IOBs
301_1-301_N clock the respective input signals IN_1-IN_N and
output signals O_1-O_N in response to delayed clock signals
CLK1-CLKN, respectively.

[0023] In the described embodiment, each of delay lines
321_1-321_N is programmed to introduce the same delay (although
this is not necessary in all embodiments). The delay
introduced by each of delay lines 321_1-321_N is selected in
response to a delay control signal M provided by register
340. That is, the number of delay elements introduced by
each of delay lines 321_1-321_N is selected in response to
delay control signal M.

[0024] Fig. 4 is a circuit diagram of delay line 321_1 in
accordance with one embodiment of the present invention. In
this embodiment, delay line 321_1 includes series-connected
delay elements 401_1-401_Z and multiplexer 402. The CLK0
signal propagates through delay elements 401_1-401_Z, thereby
creating delayed clock signals CD1-CDZ, respectively. The
CLK0 signal and the delayed clock signals CD1-CDZ are
provided to input terminals of multiplexer 402. Delay
control signal M is provided to control terminals of
multiplexer 402. Multiplexer 402 routes one of the clock
signals CLK0, CD1-CDZ as the output clock signal CLK1 in
response to delay control signal M. For example, if the
delay control signal M has a value of "3", then multiplexer
402 routes the clock signal CD3 as the CLK1 signal, thereby
introducing three delay elements (and three delay periods) to
the path of the CLK0 signal. In this manner, delay control
signal M controls the delay introduced by delay line 321_1.
Each of delay elements 401_1-401_Z can be implemented by a
plurality of series-connected inverters, or by other well
known delay circuitry. In the described embodiment,
delay lines 321_2-321_N are identical to delay line 321_1.
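
The tap selection performed by multiplexer 402 can be modeled
behaviorally. The following Python sketch is an illustration
only. It borrows the example values of eight delay elements
per line and 200 picoseconds per element from elsewhere in
this description. The function name delay_line_output and the
idealized, identical element delays are assumptions.

```python
def delay_line_output(clk0_edge_ps, m, z=8, element_delay_ps=200):
    """Time of the clock edge selected by the multiplexer for control value m.

    Tap 0 is the undelayed CLK0 signal; tap k is CDk, that is, CLK0 after
    passing through k series-connected delay elements.
    """
    if not 0 <= m <= z:
        raise ValueError("m must select one of the z + 1 multiplexer inputs")
    # One entry per multiplexer input: CLK0, CD1, ..., CDz.
    taps = [clk0_edge_ps + k * element_delay_ps for k in range(z + 1)]
    return taps[m]


# M = 3 selects CD3: three delay elements in the path of CLK0.
edge_ps = delay_line_output(0, 3)
```

In this model, a control value of 0 passes the CLK0 signal
through without added delay.
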
[0025] Returning now to Fig. 3, DCM 311 provides the CLK0
signal in response to the input clock signal CLKin. More
specifically, the CLKin signal is applied to delay line 313.
In response, delay elements 315_1-315_X provide delayed clock
signals C1-CX, respectively. The CLKin signal and the delayed
clock signals C1-CX are provided to input terminals of
multiplexer 314. Delay select circuit 312, which is
described in more detail below, provides delay select value Y
to control terminals of multiplexer 314. Multiplexer 314
routes one of the clock signals CLKin, C1-CX as the output
clock signal CLK0 in response to delay select value Y.
[0026] The signal routed by multiplexer 314 is provided as
the CLK0 signal. As described above, the CLK0 signal is
provided to IOB 301_0 and delay line 321_1. The CLK0 signal is
also provided to an input terminal of delay select circuit
312 within DCM 311. Delay select circuit 312 compares the
CLK0 and CLKin signals, and adjusts the delay select value Y
until the CLK0 signal is synchronized with the CLKin signal.
That is, delay select circuit 312 adjusts the delay select
value Y until the delay introduced to the CLK0 signal is
equal to one period of the CLKin signal. The delay select
value Y identifies the number of delay elements 315_1-315_X
introduced to the path of the CLK0 signal. Thus, when DCM
311 is locked, the delay select value Y identifies the number
of delay elements 315_1-315_X corresponding with one period of
the CLKin signal.
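
The locking behavior can be approximated by a simple search
for the delay select value Y. The following Python sketch is
a behavioral approximation only. The linear search, the
function name lock_delay_select and the "first value that
reaches the period" lock criterion are assumptions; the
actual delay select circuit compares the CLK0 and CLKin
signals rather than computing a product. The values of 128
elements and 200 picoseconds per element are the example
values used elsewhere in this description.

```python
def lock_delay_select(period_ps, x=128, element_delay_ps=200):
    """Return the smallest Y whose total delay reaches one CLKin period.

    Returns None if the delay line (x elements) is too short to lock.
    """
    for y in range(1, x + 1):
        if y * element_delay_ps >= period_ps:
            return y
    return None


# A period of 8400 ps corresponds to 42 elements of 200 ps each.
y_locked = lock_delay_select(8400)
```

A period longer than the full delay line (128 x 200 ps in
this example) cannot be matched, which the sketch reports by
returning None.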

[0027] The number of delay elements (Z) in each of
programmable delay lines 321_1-321_N is selected to be equal to
a subset of the number of delay elements (X) in delay line
313. In one embodiment, delay line 313 includes 128 delay
elements (i.e., X = 128), and each of programmable delay
lines 321_1-321_N includes 8 delay elements (i.e., Z = 8). In
one embodiment, the number N of delay lines coupled in series
is selected such that the total number of delay elements in
the series-connected delay lines 321_1-321_N equals the total
number of delay steps in delay line 313. Thus, in the
described embodiment, N is equal to 16 (i.e., 128/8). Note
that the variables Z, X and N can have other values in other
embodiments.

[0028] Each of the delay elements in programmable delay
lines 321_1-321_N is identical to the delay elements in delay
line 313. For example, each of the delay elements 315_1-315_X
in delay line 313 and each of the delay elements (e.g.,
401_1-401_Z) in each of delay lines 321_1-321_N may introduce
a signal delay of 200 picoseconds.

[0029] The delay select value Y is also provided to
arithmetic logic unit 350. In response, arithmetic logic unit
350 divides the number of delay elements represented by delay
select value Y by the number (N) of programmable delay
lines 321_1-321_N, thereby creating a delay control value M
that represents the number of delay elements to be inserted
by each of the programmable delay lines 321_1-321_N. For
example, if delay select value Y indicates that 42 delay
elements (i.e., delay elements 315_1-315_42) are introduced to
the path of the CLKin signal (i.e., the period of the CLKin
signal is equal to 42 delay periods), then ALU 350 provides a
delay control value M representative of the quotient of 42
and 16, or 2. Note that any fractional portion of the
quotient is truncated. The delay control value M is stored
in delay control register 340, and is provided to each of
delay lines 321_1-321_N. In the described example, each of
programmable delay lines 321_1-321_N introduces two delay
periods in response to the delay control value M.
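
The truncating division can be written out as follows. This
Python sketch is illustrative only; the function name
delay_control_value is an assumption, and integer floor
division stands in for whatever circuit ALU 350 uses.

```python
def delay_control_value(y, n):
    """Delay elements enabled per programmable delay line (fraction truncated)."""
    if n <= 0:
        raise ValueError("n must be a positive number of delay lines")
    # For non-negative y, floor division discards the fractional part.
    return y // n


m = delay_control_value(42, 16)  # 42 / 16 = 2.625, truncated to 2
total_elements = m * 16          # elements enabled across all 16 lines
```

Because the fraction is truncated, the 16 delay lines together
enable 32 delay elements, which is fewer than the 42 elements
that correspond to one period of the CLKin signal.
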
[0030] Fig. 5 is a waveform diagram illustrating the clock
signals CLKin, CLK0-CLK16 and associated data values (e.g.,
D1[16:0]) in accordance with the described embodiment. As
shown in Fig. 5, the CLKin and CLK0 signals exhibit rising
edges at time T0, and the CLK1-CLK16 signals exhibit rising
edges at times T1-T16, respectively. Delays of about 400
picoseconds (the delay associated with two delay elements)
exist between the rising edges of the successive clock
signals CLK0-CLK16.
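
The rising-edge times of the successive clock signals can be
tabulated under the example values of this embodiment (16
delay lines, two delay elements per line, 200 picoseconds per
element). The following Python sketch is illustrative only;
the function name delayed_edge_times and the assumption of
ideal, identical delays are not part of the description.

```python
def delayed_edge_times(n=16, m=2, element_delay_ps=200):
    """Rising-edge offsets of CLK0..CLKn relative to T0, in picoseconds."""
    # CLKk has passed through k delay lines of m elements each.
    return [k * m * element_delay_ps for k in range(n + 1)]


edge_offsets = delayed_edge_times()
# CLKin period implied by 42 delay elements of 200 ps each.
period_ps = 42 * 200
```

Under these values the last clock signal rises 6400
picoseconds after T0, so all seventeen edges fall within a
single 8400 picosecond period of the CLKin signal.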

[0031] As a result, the bits associated with data value D1
are sequentially switched out of IOBs 301_0-301_16 in a
"zipper-like" manner during a single cycle of the CLKin
signal. Because these IOBs 301_0-301_16 are not simultaneously
switched, the transient output switching current is greatly
reduced (e.g., by a factor of 17).
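
The factor of 17 can be related to the number of outputs that
switch at the same instant. The following Python sketch only
counts coincident clock edges; it is not a model of transient
current, and the degree to which real switching transients
overlap depends on driver characteristics not modeled here.
The function name peak_simultaneous is an assumption.

```python
from collections import Counter


def peak_simultaneous(edge_times_ps):
    """Largest number of outputs that share the same switching instant."""
    return max(Counter(edge_times_ps).values())


# All 17 IOBs clocked by a single clock signal.
simultaneous = [0] * 17
# IOBs clocked by CLK0..CLK16, with edges 400 ps apart.
staggered = [k * 400 for k in range(17)]

ratio = peak_simultaneous(simultaneous) // peak_simultaneous(staggered)
```

In this count, the staggered clocking reduces the number of
simultaneously switched outputs from 17 to 1.
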
[0032] Although only one set of IOBs 301_0-301_N is
illustrated in Fig. 3, it is understood that other identical
sets of IOBs can be implemented in the same manner on the
same FPGA.

[0033] When the temperature or other operating conditions
of the FPGA change, the delay select value Y (i.e., the 
number of selected delay elements in delay line 313) may 
change dynamically. In this case, arithmetic logic unit 350 
generates a new delay control value M (as appropriate) in 
response to the new delay select value Y. If a new delay 
control value M is generated (and stored in delay control 
register 340), then each of programmable delay lines
321_1-321_N is adjusted in view of this new delay control value M.
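
The dynamic update can be sketched as a register that is
rewritten only when the truncated quotient changes. The
following Python fragment is a toy model. The class name
DelayControlRegister and the drifted values of Y (45 and 48)
are hypothetical and chosen only to show one case where M is
unchanged and one where it changes.

```python
class DelayControlRegister:
    """Toy stand-in for register 340: holds M and reports when it changes."""

    def __init__(self, n):
        self.n = n      # number of programmable delay lines
        self.m = None   # no delay control value stored yet

    def update(self, y):
        """Recompute M from a new delay select value Y; return True if M changed."""
        new_m = y // self.n
        changed = new_m != self.m
        self.m = new_m
        return changed


reg = DelayControlRegister(n=16)
reg.update(42)                    # initial lock: M = 2
changed_small = reg.update(45)    # hypothetical small drift: quotient still 2
changed_large = reg.update(48)    # hypothetical larger drift: quotient becomes 3
```

In this model a small change in Y leaves the delay lines
untouched, while a larger change produces a new value of M
for every delay line.
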
[0034] Although the invention has been described in 
connection with several embodiments, it is understood that 
this invention is not limited to the embodiments disclosed, 
but is capable of various modifications, which would be 
apparent to one of ordinary skill in the art. For example, 
the number of programmable delay lines 321_1-321_N (Fig. 3) can
be selected during configuration of the FPGA. That is, each
IOB can have an associated programmable delay line that may
be selectively coupled or de-coupled from adjacent 
programmable delay lines during the configuration of the 
FPGA. Thus, the present invention is only limited by the 
following claims. 